(57) A method and system are described for preserving load
balancing of client transactions, for the whole duration of
the client sessions, in a Web site implemented as a cluster
of servers. The invention sends only the initial request of
each client session to the site load balancer, thus greatly
enhancing the capability of the site to accept new session
requests. All subsequent requests from a client are forwarded
directly to the server first selected, so that the sessions
cannot later be broken by the load balancer. The scheme works
regardless of whether the client is behind a proxy or a
firewall, and greatly contributes to the performance of the
Web site.

Description

Field of the Invention 

[0001] The present invention deals with the global Internet
network and more particularly with those Internet servers of
World Wide Web (WWW) sites that are organized as a cluster or
group of servers forming a single entity.

Background of the Invention 

[0002] The Internet is the world's largest network, and it
has become essential in organizations such as government,
academia and commercial enterprises. Transactions over the
Internet are becoming more common, especially in the
commercial arena. The information that organizations have on
their traditional or legacy business applications may now be
published and made accessible to a wide audience. This access
may include a person checking a bank savings account, making
a hotel reservation or buying tickets for a concert. Making
this information or service available to their customers is a
competitive advantage for any organization. However,
regardless of the innovation and potential benefits provided
by a company's Internet solution, its value is greatly
reduced if the information cannot be accessed in a reasonable
response time.

[0003] The load on an Internet site is unlikely to remain
constant. The number of accesses on a Web server can increase
for several reasons.

1. Most companies add their Web site's address to television,
radio and print advertising and to product catalogues and
brochures. Therefore, awareness of the Web site grows.

2. As time passes, the Web site gains better coverage in the
on-line search engines.

3. Assuming the site is providing useful information or a
useful service to customers, repeat visitors should increase.

4. Most Web sites begin simply, with fairly modest content,
mostly text, with some images. As the site designers grow in
confidence, more resources are allocated, and as Web users in
general increase their modem speeds, most sites move towards
richer content. Thus, not only do hit rates increase, but the
average data transfer per hit also rises.

5. Most sites begin as presence sites providing corporate
visibility on the Internet and making information about the
company available to potential customers. Most presence sites
use predominantly static HyperText Markup Language (HTML)
pages. Static pages are generated in advance and stored on
disk. The server simply reads the page from the disk and
sends it to the browser. However, many companies are now
moving towards integration applications that allow users of
the Web site to directly access information from the
company's existing applications. This could include checking
the availability of products, querying bank account balances
or searching problem databases. These applications require
actual processing on the server system to dynamically
generate the Web page. This dramatically increases the
processing power required in the server.

[0004] There are several ways to deal with the growth of an
Internet site. One is to purchase an initial system that is
much too large. However, most companies are not willing to
invest large sums of money in a system that is much larger
than they require, particularly since the benefits that they
will gain from the site have yet to be proven. Most prefer to
purchase a minimal initial system and to upgrade in the
future as the site demonstrates its worth to the company. In
this realm of solutions, load balancing between multiple
servers is very often used. In this case, the load for the
overall site is balanced between multiple servers. This
allows scaling beyond the maximum performance available from
a single system and allows for easy upgrading by simply
installing additional servers and reconfiguring the cluster
to use them. This solution can also provide the added benefit
of higher server availability. The load-balancing software
can automatically allow for the failure of a single server
and balance the load between the remaining servers. Because
the Internet model allows the distribution of services among
different servers, called Internet Servers, it is definitely
feasible not to tie an application to one specific server.
Instead, the service belongs to a group of servers, so a
computer can be added or removed when necessary. However,
grouping the set of servers in a single entity implies that
load balancing must be well performed between these servers
so as to actually achieve optimum performance. A discussion
of this, and more on load balancing, can be found for example
in a paper by Dias et al., "A Scalable and Highly Available
Web Server", Digest of Papers, Compcon 1996, Technologies for
the Information Superhighway, Forty-first IEEE Computer
Society International Conference (Cat. No. 96CB35911),
pp. 85-92, Feb. 1996.
[0005] Load-balancing products have made their way to the
market. IBM's eNetwork Dispatcher (eND) is one of those
products now commercially available. It creates the illusion
of having just one server by grouping systems together into a
cluster that behaves as a single, virtual server. The service
provided is no longer tied to a specific server system, so it
is possible to add or remove systems from the cluster, or
shut down systems for maintenance, while maintaining
continuous service for the clients. To the end users, the
servers among which traffic is balanced appear to be a
single, virtual server. The site thus appears as a single IP
(Internet Protocol) address to the world. All requests are
sent to the IP address of the Network Dispatcher machine,
which decides with each client request which server is the
best one to accept requests, according to certain dynamically
set weights. Network Dispatcher routes the client's request
to the selected server, and then the server responds directly
to the client without any further involvement of eND. This
makes it possible to have a small bandwidth network for
incoming traffic (like Ethernet or token ring) and a large
bandwidth network for outgoing traffic (like ATM,
Asynchronous Transfer Mode; FDDI, Fiber Distributed Data
Interface; or Fast Ethernet). It can also detect a failed
server and route traffic around it. General information on
the way of performing load balancing between multiple servers
and on the eND product can be found in a 'Redbook' by IBM,
published by the Austin, Texas center of the International
Technical Support Organization (ITSO), entitled
"Load-Balancing Internet Servers", reference SG24-4993,
December 1997.

[0006] Those products do well what they have been designed
for, i.e., load balancing, and indeed make it possible to
build scalable Web sites capable of coping with a rapidly
growing demand for higher traffic. However, they have created
their own difficulties too. Because there are now numerous
sophisticated Web servers that handle dynamic Web pages,
these servers need to be session-aware for every user
accessing their service. Several techniques exist to keep
track of the context in which a particular user is accessing
a Web server. They are of two kinds:

• the contextual data circulates, back and forth, in the IP
packets exchanged between the client and the servers. For
example, it can be part of the Web pages themselves.

• or the contextual data is kept in the Web server active
memory or on disk. This second solution is necessary whenever
the amount of data needed to define each session context is
too large to be practically transported over the network with
each transaction between the client and the servers.

[0007] Load-balancing products such as eND therefore do not
dispatch traffic randomly to the servers of their cluster.
They keep track of the user requests that must end up at the
same server while a session is active. To achieve this, the
usual technique, well known in the art, uses the IP address
of the client: each transaction coming from the same IP
address is dispatched to the same server.
[0008] However, this does not fit the now frequent situations
where the end user and the server are on either side of a
proxy, a SOCKS server or a firewall. All those devices, part
of the Internet, are intended to deal with specific problems.
One example is the isolation of an intranet that must not be
freely accessible by outsiders without any control, which
leads to placing a firewall at the intranet gateway. Another
is a proxy, through which the users within an intranet see
the whole Internet via a common gateway device that caches
content in an attempt to achieve overall better performance.
In these situations, the client IP address is not actually
known by the network dispatcher, which in fact establishes a
TCP connection (TCP being the Transmission Control Protocol
of the Internet protocol suite) with the proxy, the SOCKS
server or the firewall rather than directly with the end
user. Therefore, the network dispatcher is no longer
session-aware; that is, it has no information that would
allow it to decide that a transaction has not yet completed
when a particular end user, for example located behind a
proxy, is engaged in a transaction such as buying a product
from a virtual shop, an application that was first assigned
by the dispatcher to a particular server in the cluster of
servers. Then, further requests from the end user, sometimes
occurring after a long pause, could be dispatched differently
by the network dispatcher, simply because it does its job of
balancing the traffic towards a less busy server within the
cluster, with the obvious consequence that the new server is
not aware of the transaction in progress.

[0009] For a load balancer, there is another undesirable
effect of having the end users behind a proxy. All the
individual users within a group, for example an intranet,
then appear to the load balancer as a single user because
their IP address is the same: it is that of the proxy or
firewall. Therefore, the load balancer, which tends to keep
dispatching a given user towards the same server in an
attempt not to break sessions, at least while an inactivity
timer has not elapsed, keeps sending the traffic of the whole
intranet to the same server. This seriously goes against what
this kind of product is trying to achieve, i.e., load
balancing. Although the individual users within a group would
certainly prefer not to be served by the same, sometimes
busy, server, it is no longer possible to discriminate
between them because the load balancer sees them as a single
client.
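
The effect described in paragraphs [0007] to [0009] can be
illustrated with a minimal sketch. It is not taken from eND
or any other product: the function name, the server names and
the addresses are hypothetical, and the "least used" rule
simply counts how many source addresses are already pinned to
each server.

```python
from collections import Counter

SERVERS = ["server_a", "server_b", "server_c"]  # hypothetical cluster


def dispatch_by_source_ip(source_ip, servers, affinity):
    """Pin each source IP to one server, chosen on first contact."""
    if source_ip not in affinity:
        # Count how many source addresses each server already holds.
        pinned = Counter(affinity.values())
        affinity[source_ip] = min(servers, key=lambda s: pinned[s])
    return affinity[source_ip]


def simulate(source_ips):
    """Dispatch one request per entry and count requests per server."""
    affinity = {}
    hits = Counter()
    for ip in source_ips:
        hits[dispatch_by_source_ip(ip, SERVERS, affinity)] += 1
    return dict(hits)


# Six users reaching the site directly, each with its own address.
direct = simulate([f"10.0.0.{n}" for n in range(1, 7)])

# The same six users behind one proxy: a single visible address.
proxied = simulate(["192.0.2.1"] * 6)
```

Under these assumptions the six direct users are spread over
the three servers, whereas the six proxied users all land on
the first server selected, because the balancer cannot tell
them apart.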

Object of the Invention 


[0010] Thus, it is a broad object of the invention to
overcome the shortcomings, as noted above, of the prior art
and therefore to enable a particular server, within a cluster
of servers, to continue serving a given end user while the
current session is active, and to discriminate between the
individual users within a group (intranet) so as to maintain
good load balancing over the cluster of servers.

[0011] It is a further object of the invention to improve the
efficiency of the load balancer by requiring only one
interrogation per session, thus freeing it to dispatch even
more transactions over the cluster of servers.

[0012] Further advantages of the present invention will
become apparent to those skilled in the art upon examination
of the drawings and detailed description. It is intended that
any additional advantages be incorporated herein.

Summary of the Invention

[0013] A method and system are described for preserving load
balancing of the client transactions, for the whole duration
of the client sessions, in a Web site comprising a plurality
of servers and including a load balancer accessed from a
plurality of clients. Upon receiving a client initial
request, the load balancer selects a particular server among
the plurality of servers. Then, the initial request is
forwarded to the selected server, which issues, towards the
client, a response uniquely referencing the selected server.
Hence, all subsequent requests from the client are forwarded
directly to the uniquely referenced server.

[0014] The method of the invention makes it possible to send
only the initial request of a client session to the load
balancer of a Web site organized as a cluster of servers,
thus greatly enhancing the capability of the site to accept
new session requests.

[0015] Moreover, the client sessions, being carried out
directly between the client and the server initially
selected, cannot later be broken by the load balancer.

[0016] Finally, the scheme works regardless of whether the
client is behind a proxy or a firewall, contrary to the prior
art, which could only rely on the IP address of the client
request to perform load balancing and to decide whether a
session has ended. This led to imperfect results both in
terms of load balancing and broken sessions, especially when
the actual IP address of the end user is masked by one of the
above-mentioned devices.

Brief Description of the Drawings 
[0017] 

Figure 1 describes the prior art, where a load balancer
dispatches end-user requests over a cluster of servers but is
prevented from always taking full advantage of the computing
resources of the servers when there are too many requests in
progress.

Figure 2 depicts the common case where a proxy or a firewall
sits between the end users and the cluster of servers, thus
preventing the load balancer from performing a fair
dispatching of the incoming requests. The breaking of
end-user sessions in case of inactivity is also described.

Figure 3 describes the general solution brought by the
invention to the shortcomings of the prior art.

Figure 4 describes a particular implementation of the
invention that always ensures a fair balancing of the
workload over all the servers in a cluster of servers and
guarantees that no session is broken.

Figure 5 describes an alternate implementation of the
invention with the same advantages.

Detailed Description of the Preferred Embodiment 

[0018] Figure 1 illustrates the prior art, where a load
balancer [100] groups several servers [105], [110] and [115]
together into a cluster [120] that behaves as a single,
virtual server, so that the service provided is no longer
tied to a specific server system. To end users, like [125],
the servers among which traffic is balanced appear to be a
single, virtual server. The site thus appears as a single DNS
(Domain Name System) name and IP address to the world [130].
All requests, such as [140], issued from a Web browser, are
sent to the IP address of the load balancer machine [100],
which decides with each client request which server is the
best one to accept requests, according to their respective
workloads. Hence, load balancer [100] routes the client's
request to the selected server and the server responds
directly [135] to the client without any further involvement
of the load balancer. In practice, the load balancer receives
the IP packets destined to the cluster. These packets have a
source and a destination address. The destination address is
the IP address of the cluster [130]. All servers in the
cluster, i.e. [105], [110] and [115], have their own IP
address and know the cluster's IP address too. The dispatcher
system checks which server is the least busy and routes the
packet to that server. The server receives the packet and is
able to respond directly to the client based on the source
address contained in the packets received by the load
balancer. However, with this scheme, all browser requests
always end up at the load balancer, i.e., those of all the
sessions in progress plus those for the new sessions. If too
many requests converge on the site, the bottleneck may become
the load balancer itself, even though enough computational
resources would be left within the cluster of servers to
handle them.

[0019] Figure 2 illustrates the now frequent case where a
sometimes significant portion of a network, for example an
intranet [200] shared between a group of related users, is
behind a proxy, a firewall or any equivalent device that
filters the packets so that the IP addresses of the
individual users within the intranet appear to be the same
[210] to those outside. The task of the load balancer [220]
then becomes more difficult to carry out because, after a
first request has been received from one of the users on the
intranet [200], it tends to keep sending all subsequent ones
towards the same server within the cluster [230], for example
[240], even though the request is actually coming from a
different user on this intranet. Because the proxy [250] has
filtered the IP packets from the intranet, load balancer
[220] is no longer able to discriminate between the
individual users like [260]. In fact, the intranet may be
quite large, with numerous individual users. If many of them
are accessing the same site, for the same type of service
processed by the cluster of servers [230], there is a good
chance that a continuous flow of requests arrives at the load
balancer. The load balancer is then bound to deliver the
requests to the initially selected server even if other
servers of the cluster, not as busy, could deliver the same
service. This case is not unlikely to happen, simply because
the intranet is shared, e.g., by the personnel of a specific
company, say a financial institution. All members may have
some interest in consulting the same type of information
during the day, say stock exchange rates. Therefore, this
kind of situation may tend to prevent the load balancer from
spreading an equal share of the workload over all servers of
a cluster of servers.
[0020] A second type of problem is encountered if, contrary
to the situation just described, no request arrives at the
load balancer for some time, so that a significant period of
inactivity leads the load balancer to conclude that the user
session has ended. It may then decide to reassess load
balancing upon the arrival of another request even though the
session is still in progress from the viewpoint of the end
user. This may happen when a particular end user pauses for a
long time while the other users on the intranet are not
accessing the cluster of servers [230]. Therefore, a further
request from the end user [260] may end up at a different
server of the cluster, i.e., not at [240]. The newly selected
server is not aware of the transaction in progress and the
context is lost. Hence, a transaction that is in progress,
e.g., the payment for an item bought from a virtual shop, is
aborted.
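
The inactivity problem of paragraph [0020] can be sketched as
follows. This is an illustrative model only, assuming a
balancer that keeps one (server, last-seen time) entry per
source address; the function name, the 300-second timeout and
the server names are hypothetical.

```python
def dispatch_with_timeout(source_ip, now, least_busy, table, timeout):
    """Reuse the pinned server unless the inactivity timer has elapsed."""
    entry = table.get(source_ip)
    if entry is not None and now - entry[1] <= timeout:
        server = entry[0]    # session assumed to be still active
    else:
        server = least_busy  # no entry, or timer elapsed: rebalance
    table[source_ip] = (server, now)
    return server


table = {}
TIMEOUT = 300  # seconds of inactivity tolerated (assumed value)

# (time of request in seconds, least busy server at that instant)
requests = [(0, "server_1"), (120, "server_2"), (900, "server_2")]
chosen = [
    dispatch_with_timeout("192.0.2.1", now, least_busy, table, TIMEOUT)
    for now, least_busy in requests
]
```

In this sketch the second request stays on server_1 although
server_2 is less busy, while the third request, arriving
after a long pause, is sent to server_2, which knows nothing
of the session in progress.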

[0021] Figure 3 depicts the general solution to the problems
induced by the use of a load balancer dispatching incoming
Web browser requests over a cluster of servers, as discussed
with the two previous figures. Whenever a new request [300]
is issued from a Web browser [305], it is forwarded to the
load balancer [310]. This is because it is the load
balancer's DNS name and corresponding IP address (referred to
as DNS0 and IP0 [312] in the particular example of figure 3)
which are made public for the service or the set of services
advertised for the Web site implemented in the form of a
cluster of servers [320]. Contrary to the prior art, it makes
no difference to the invention whether the request is
received directly [326] or through a proxy [325]. Whenever
the initial request reaches the load balancer [311], it is
dispatched to one of the servers of the cluster of servers
[320]. The decision to route towards one particular server,
like [313] in this example, is the prime job of the load
balancer. The metric used to decide which server is to be
selected at a given instant depends on the design of the load
balancer, which is assumed to collect from all the servers,
at regular intervals, performance information regarding their
level of activity. In broad terms, it can be said that the
least busy of the servers is selected in an attempt to reach
the goal of balancing the workload equally over all the
servers. The invention does not, per se, interfere with this
process, which is under the sole responsibility of the load
balancer. However, it contributes dramatically to improving
the job of the load balancer by forwarding to it only the
initial requests, like [300], issued by the Web browser [305]
when initializing a session, as will become apparent in the
following. Thus, in the example of figure 3, the request is
forwarded to server [321], which met the criterion for being
elected to process initial request [311] at the time it
reached the load balancer [310], on the basis of the
performance data that were collected by the load balancer
from the servers. At this point, the rest of the session is
going to be handled solely by that particular server, e.g.,
[321], without any further involvement of the load balancer,
which is then completely free to accept all new requests
arriving at the site (generally referred to as "hits" in the
literature that deals with Web site performance), at least
until all the resources of the cluster of servers are
exhausted. This is a complete departure from the prior art,
where the processing of the new hits interfered with the
processing of all the sessions already in progress, as
illustrated in the previous figures. There, the processing of
a new request was postponed by the load balancer when it was
itself too busy dispatching the numerous requests of the
sessions already in progress, even though plenty of computing
resources might still be available, and hence wasted, in the
cluster of servers. The above is made possible because each
of the servers within the cluster of servers has its own
unique DNS name and corresponding IP address, for example
DNS1 and IP1 for the server [321]. Therefore, the server that
has been elected by the load balancer, upon receiving the
initial request [313], will reply directly [330] to the Web
browser of the end user. This reply mentions the Uniform
Resource Locator or URL to be used for the further requests
of that session. This URL contains the DNS name or the IP
address of the elected server, i.e., DNS1 or IP1. All further
requests [332] are forwarded directly to the selected server,
thus freeing the load balancer from dispatching the
subsequent session requests as mentioned above and ensuring
that all of them reach the same server for the whole duration
of the session. Again, the Web browser and the cluster of
servers may be on either side of a proxy like [325] without
impacting the above scheme at all, contrary to the prior art.
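
The flow of figure 3 can be modelled with a short sketch. The
class, the function and the host names dns0.example.com,
dns1.example.com and dns2.example.com are hypothetical
stand-ins for DNS0, DNS1 and a second server; the activity
figures are made up. The sketch only counts how many requests
of one session pass through the load balancer.

```python
class LoadBalancer:
    """Hypothetical balancer that counts the requests it handles."""

    def __init__(self, load_by_server):
        # Maps each server's own DNS name to its reported activity.
        self.load_by_server = load_by_server
        self.hits = 0

    def select(self):
        """Handle one incoming request; return the least busy server."""
        self.hits += 1
        return min(self.load_by_server, key=self.load_by_server.get)


def run_session(balancer, public_name, paths):
    """Return the URLs a browser requests during one session."""
    # Initial request: sent to the public name, seen by the balancer.
    urls = [f"http://{public_name}{paths[0]}"]
    elected = balancer.select()
    # The elected server's reply names itself, so every later
    # request goes straight to that server.
    urls += [f"http://{elected}{path}" for path in paths[1:]]
    return urls


balancer = LoadBalancer({"dns1.example.com": 0.2, "dns2.example.com": 0.7})
session_urls = run_session(balancer, "dns0.example.com", ["/", "/shop", "/pay"])
```

With these assumed figures, the balancer is consulted once
for a three-request session, and the two later requests
address the elected server by its own name.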

[0022] Figure 4 illustrates one particular implementation of
the general solution depicted in figure 3. It takes advantage
of the above-mentioned option that a specific response could
be issued by the selected server to inform the end-user
browser of the actual DNS name of the server elected to
process its session. The protocol used by the Web server to
transfer hypermedia documents across the Internet, part of
the TCP/IP suite of protocols and known under the acronym
HTTP, for HyperText Transfer Protocol, specifically foresees
the possibility of redirecting a request that was issued for
a specific DNS name to another DNS name for the duration of
the session. Hence, whenever the selected server [400]
receives a new request [410], it has just to respond with an
HTTP redirect command [420], either directly or through a
proxy [425], destined to the end-user browser [430] and
carrying the actual DNS name of the server in charge [400],
so that the rest of the session takes place, as required,
directly between the remote browser and the particular
server.
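
A minimal sketch of the redirect of figure 4 follows. The 302
status code and the Location header are standard HTTP; the
rest is assumed for illustration: the function name, the host
names, and the use of the request's Host header to detect
that a request arrived under the public cluster name rather
than under the server's own name.

```python
def handle_request(host_header, path, own_dns_name):
    """Return (status, headers, body) for one incoming request."""
    if host_header != own_dns_name:
        # Request arrived under the cluster's public name: redirect
        # the browser to this server's own DNS name.
        location = f"http://{own_dns_name}{path}"
        return 302, {"Location": location}, b""
    # Request already addresses this server: serve it normally.
    body = b"<html><body>session page</body></html>"
    return 200, {"Content-Type": "text/html"}, body


# Initial request, dispatched here by the load balancer.
first = handle_request("dns0.example.com", "/shop", "dns1.example.com")

# Follow-up request, sent by the browser after the redirect.
second = handle_request("dns1.example.com", "/shop", "dns1.example.com")
```

Because the redirect carries a host name rather than relying
on the client IP address, it behaves the same whether or not
a proxy sits between the browser and the cluster.
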
[0023] Figure 5 is another example of how to implement the
general solution of figure 3 so as to solve the shortcomings
of the prior art. In this approach, the "WELCOME" page [500]
of every server within the cluster of servers supporting a
given set of services is identical except for the DNS names
[511] or [512], which explicitly reference the particular DNS
address of the server on which the "WELCOME" page is loaded.
It is worth noting here that although, for the sake of
clarity, this particular example is illustrated with only two
servers, i.e., Server_1 [501] and Server_2 [502], any number
of servers could be considered instead and the solution still
applies. Thus, it is the load balancer's responsibility to
decide which server is going to process a newly arriving
request from a remote end-user browser, by directing it
either to Server_1 [501] or Server_2 [502]. Whichever is
elected forwards its own "WELCOME" page [500] to the end
user. Then, when the end user decides to use "FOOBAR", a
service offered at this site and displayed in the "WELCOME"
page menu, a new request is sent by the browser which
directly addresses Server_1 or Server_2 depending on the DNS
address that was contained in the received "WELCOME" page,
i.e., ../dsn1/.. [511] or ../dsn2/.. [512]. From that point
on, the other pages of the "FOOBAR" service are referenced
relative to the first page without any further reference to
the specific server (using relative URLs, or Uniform Resource
Locators). That is, only the link is specified, as shown in
[521] and [522]. Therefore, all further requests from the
end-user browser directly reach the server that was chosen by
the load balancer at the beginning of the session, which
fully carries out the solution described in general terms in
figure 3 and thus provides all of its advantages.
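
The behaviour of relative links relied upon in figure 5 can be
checked with Python's standard urllib.parse.urljoin, which
resolves a relative reference against a base URL as a browser
does. The host name dns1.example.com, the page names and the
welcome_link helper are hypothetical; only the resolution
rule itself is standard.

```python
from urllib.parse import urljoin, urlsplit


def welcome_link(own_dns_name):
    """Absolute link placed in a server's own "WELCOME" page."""
    return f"http://{own_dns_name}/foobar/index.html"


# The only absolute link: it names the elected server explicitly.
first_page = welcome_link("dns1.example.com")

# Later pages are referenced relative to the page that links to them.
order_page = urljoin(first_page, "order.html")
help_page = urljoin(order_page, "../help/faq.html")

# Every resolved URL keeps the host of the first page.
hosts = {urlsplit(url).netloc for url in (first_page, order_page, help_page)}
```

Since a relative reference carries no host of its own, every
page reached this way is requested from the server named in
the "WELCOME" page.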



Claims

1. A method for preserving load balancing of the client
transactions, for the whole duration of the client sessions,
in a Web site comprising a plurality of servers and including
a load balancer accessed from a plurality of clients, said
method comprising the steps of:

receiving a client initial request by said load balancer; and

selecting from said load balancer a particular server among
said plurality of servers; and

forwarding said initial request to said selected server; and

issuing from said selected server, towards said client, a
response uniquely referencing said selected server; and

forwarding directly all subsequent requests from said client
to said uniquely referenced server.

2. The method according to claim 1 wherein:
said response from said selected server consists, before said
client initial request is honored, in issuing a redirection
command to said client including the unique reference of said
selected server.

3. The method according to claim 1 wherein:
said response from said selected server self-contains said
unique reference of said selected server.

4. The method according to any one of the previous claims
wherein:
said client initial request is the only request received by
said load balancer from said client for the whole duration of
said client session.

5. The method according to any one of the previous claims
wherein:
said client session is processed by said selected server
until said session is ended by said client.

6. A system, in particular a Web site comprising a plurality
of servers and including a load balancer, comprising means
adapted for carrying out the method according to any one of
the previous claims.

7. A computer-like readable medium comprising instructions
for carrying out the method according to any one of the
claims 1 to 5.